Monotonicity is a simple yet significant qualitative characteristic. We consider the problem of segmenting a sequence in up to K segments.
We want segments to be as monotonic as possible and to alternate signs. We propose a quality metric for this problem using the $l_\infty$ norm,
and we present an optimal linear time algorithm based on novel formalism. Moreover, given a precomputation in time $O(n \log n)$ consisting
of a labeling of all extrema, we compute any optimal segmentation in constant time. We compare experimentally its performance to two
piecewise linear segmentation heuristics (top-down and bottom-up). We show that our algorithm is faster and more accurate. Applications
include pattern recognition and qualitative modeling.
1. Introduction

Monotonicity is one of the most natural and important qualitative properties for sequences of data points. 
It is easy to determine where the values are strictly going up or down, but we only want to identify 
significant monotonicity. For example, the drop from 2 to 1.9 in the array 0,1,2,1.9,3,4 might not be 
significant and might even be noise-related. The quasi-monotonic segmentation problem is to determine 
where the data is approximately increasing or decreasing. 
In practical applications, sequences of values can be quite large: it is not uncommon to have sensors
record data at 10 kHz or more, thus generating terabytes of data and billions of data points. As a
dimensionality reduction step [2], segmentation divides the data into intervals having homogeneous
characteristics (flatness, constant slope [3], unimodality [4], monotonicity [5,6], step, ramp or impulse [7], and so
on). The segmentation points can also be used as markers to indicate a qualitative change in the data. 
Other applications include frequent pattern mining [8] and time series classification [9]. For qualitative 
reasoning [10], piecewise monotonic segmentation is especially important as it provides a symbolic model 
describing system behavior in terms of increasing and decreasing relations between variables. 

There is a trade-off between the number of segments and the approximation error. Some segmentation
algorithms [5] give a segmentation having no more than K segments while attempting to minimize the
error $\epsilon$; other algorithms [6] attempt to minimize the number of segments (K) given an upper bound on
the error $\epsilon$. We are concerned with the first type of algorithm in this paper.

Using dynamic programming or other approaches, most segmentation problems can be solved in time
$O(n^2)$. Other solutions to this problem, using machine learning to classify the pairs of data points [10], are
even less favorable since they have higher complexity. However, it is common for sequences of data points to
be massive, and segmentation algorithms must have complexity close to $O(n)$ to be competitive. While
approximate linear regression segmentation algorithms can be $O(n)$, we show that using a linear regression
error to segment according to monotonicity is not an ideal solution.

We present a metric for the quasi-monotonic segmentation problem called the Optimal Monotonic
Approximation Function Error (OMAFE); this metric differs from the previously introduced OPMAFE metric [5]
since it applies to all segmentations and not just "extremal" segmentations. We formalize the novel concept
of a maximal *-pair and show that it can be used to define a unique labeling of the extrema leading to
an optimal segmentation algorithm. We also present an optimal linear time algorithm to solve the
quasi-monotonic segmentation problem given a segment budget, together with an experimental comparison to
quantify the benefits of our algorithm.
2. Monotonicity Error Metric (OMAFE)

Finding the best piecewise monotonic approximation can be viewed as a classical functional approximation 
problem [11], but we are concerned only with discrete sequences. 

Suppose $n$ samples noted $F : D = \{x_1, \ldots, x_n\} \to \mathbb{R}$ with $x_1 < x_2 < \cdots < x_n$. We define $F|_{[a,b]}$ as the
restriction of $F$ over $D \cap [a, b]$. We seek the best monotonic (increasing or decreasing) function $f : \mathbb{R} \to \mathbb{R}$
approximating $F$. Let $\Omega_{\uparrow}$ (resp. $\Omega_{\downarrow}$) be the set of all monotonic increasing (resp. decreasing) functions. The
Optimal Monotonic Approximation Function Error (OMAFE) is $\min_{f \in \Omega} \max_{x \in D} |f(x) - F(x)|$
where $\Omega$ is either $\Omega_{\uparrow}$ or $\Omega_{\downarrow}$.

A segmentation of a set $D$ is a sequence $S = X_1, \ldots, X_K$ of intervals in $\mathbb{R}$ with $[\min D, \max D] = \bigcup_i X_i$
such that $\max X_i = \min X_{i+1} \in D$ and $X_i \cap X_j = \emptyset$ for $j \neq i + 1, i - 1$. Alternatively, we can define a
segmentation from the set of points $X_i \cap X_{i+1} = \{y_{i+1}\}$, $y_1 = \min X_1$, and $y_{K+1} = \max X_K$. Given
$F : \{x_1, \ldots, x_n\} \to \mathbb{R}$ and a segmentation, the Optimal Monotonic Approximation Function Error (OMAFE)
of the segmentation is $\max_i \mathrm{OMAFE}(F|_{X_i})$ where the monotonicity type (increasing or decreasing) of the
segment $X_i$ is determined by the sign of $F(\max X_i) - F(\min X_i)$. Whenever $F(\max X_i) = F(\min X_i)$,
we say the segment has no direction and the best monotonic approximation is just the flat function
having value $(\max F|_{X_i} + \min F|_{X_i})/2$. The error is computed over each interval independently; optimal
monotonic approximation functions are not required to agree at $\max X_i = \min X_{i+1}$. Segmentations should
alternate between increasing and decreasing, otherwise sequences such as 0, 2, 1, 0, 2 can be segmented as
two increasing segments 0, 2, 1 and 1, 0, 2: we consider it natural to aggregate segments with the same
monotonicity.

We solve for the best monotonic function as follows. If we seek the best monotonic increasing function,
we first define $\overline{f}_{\uparrow}(x) = \max\{F(y) : y \le x\}$ (the maximum of all previous values) and
$\underline{f}_{\uparrow}(x) = \min\{F(y) : y \ge x\}$ (the minimum of all values to come). If we seek the best monotonic decreasing function, we define
$\overline{f}_{\downarrow}(x) = \max\{F(y) : y \ge x\}$ (the maximum of all values to come) and $\underline{f}_{\downarrow}(x) = \min\{F(y) : y \le x\}$ (the
minimum of all previous values). These functions, which can be computed in linear time, are all we need to
solve for the best approximation function as shown by the next theorem, which is a well-known result [12].

Theorem 2.1 Given $F : D = \{x_1, \ldots, x_n\} \to \mathbb{R}$, a best monotonic increasing approximation function to
$F$ is $f_{\uparrow} = (\overline{f}_{\uparrow} + \underline{f}_{\uparrow})/2$ and a best monotonic decreasing approximation function is
$f_{\downarrow} = (\overline{f}_{\downarrow} + \underline{f}_{\downarrow})/2$. The
corresponding error (OMAFE) is $\max_{x \in D}(|\overline{f}_{\uparrow}(x) - \underline{f}_{\uparrow}(x)|)/2$ or $\max_{x \in D}(|\overline{f}_{\downarrow}(x) - \underline{f}_{\downarrow}(x)|)/2$ respectively.

The implementation of the algorithm suggested by the theorem is straightforward. Given a segmentation,
we can compute the OMAFE in $O(n)$ time using at most two passes.
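The two-pass computation suggested by Theorem 2.1 can be sketched in Python as follows. This is only an illustration: the function names `omafe_segment` and `omafe_segmentation` are hypothetical, segments are assumed to be given as index cut points sharing their endpoints, and a segment whose end values are equal is treated as flat, as described above.

```python
from itertools import accumulate


def omafe_segment(values, increasing=True):
    """l-infinity error of the best monotonic fit to one segment (Theorem 2.1)."""
    values = list(values)
    if not increasing:
        # A decreasing fit of the data is an increasing fit of the reversed data.
        values.reverse()
    upper = list(accumulate(values, max))                  # max of values so far
    lower = list(accumulate(reversed(values), min))[::-1]  # min of values to come
    return max(u - l for u, l in zip(upper, lower)) / 2


def omafe_segmentation(values, cuts):
    """Largest per-segment error; `cuts` are indices y_1 < ... < y_{K+1}."""
    worst = 0.0
    for a, b in zip(cuts, cuts[1:]):
        segment = values[a:b + 1]
        if segment[0] == segment[-1]:
            # No direction: the flat function at mid-range is the best fit.
            error = (max(segment) - min(segment)) / 2
        else:
            error = omafe_segment(segment, increasing=segment[-1] > segment[0])
        worst = max(worst, error)
    return worst
```

On the array 0, 1, 2, 1.9, 3, 4 from the introduction, `omafe_segment` returns about 0.05, half of the small drop from 2 to 1.9.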

3. A Scale-Based Algorithm for Quasi-Monotonic Segmentation

We use the following proposition to prove that the segmentations we generate are optimal (see
Theorem 3.8).

Proposition 3.1 A segmentation $y_1, \ldots, y_{K+1}$ of $F : D = \{x_1, \ldots, x_n\} \to \mathbb{R}$ with alternating
monotonicity has a minimal OMAFE $\epsilon$ for a number of alternating segments K if

(A) $F(y_i) = \max F([y_{i-1}, y_{i+1}])$ or $F(y_i) = \min F([y_{i-1}, y_{i+1}])$ for $i = 2, \ldots, K$;
(B) in all intervals $[y_i, y_{i+1}]$ for $i = 1, \ldots, K$, there exist $z_1, z_2$ such that $|F(z_2) - F(z_1)| > 2\epsilon$.

Proof Let the original segmentation be the intervals $S_1, \ldots, S_K$ and consider a new segmentation with
intervals $T_1, \ldots, T_K$. Assume that the new segmentation has lower error (as given by OMAFE). Let
$S_i = [y_i, y_{i+1}]$ and $T_i = [y'_i, y'_{i+1}]$.

If any segment $T_m$ contains a segment $S_j$, then the existence of $z_1, z_2$ in $[y_j, y_{j+1}]$ such that
$|F(z_2) - F(z_1)| > 2\epsilon$ and $\mathrm{OMAFE}(T_m) < \epsilon$ implies that $T_m$ and $S_j$ have the same monotonicity.

We show that each pair of intervals $S_i, T_i$ has nonempty intersection. Suppose not, and let $i$ be the
smallest index such that $S_i \subset T_{i-1}$. Since $S_i$ and $T_{i-1}$ have the same monotonicity, for each $j < i$, $S_j$ and
$T_j$ have opposite monotonicity. Now consider the $i - 1$ intervals $T_1, \ldots, T_{i-1}$ and the $i$ points $y_1, \ldots, y_i$. At
least one interval contains two consecutive points; choose the largest $j < i$ such that $T_j$ contains $y_j$ and $y_{j+1}$.
But then $S_j \subset T_j$, contradicting at least one of the assumptions $|F(z_2) - F(z_1)| > 2\epsilon$ for $z_1, z_2 \in S_j$ and
$\mathrm{OMAFE}(T_j) < \epsilon$.

It now follows that each pair of intervals $S_i, T_i$ has the same monotonicity.

Figure 1. A $\delta$-pair.

Since $\mathrm{OMAFE}(T) < \mathrm{OMAFE}(S)$, we can choose an index $j$ such that $\mathrm{OMAFE}(T_j) < \mathrm{OMAFE}(S_j)$.
We show that there exists another index $p$ such that $\mathrm{OMAFE}(T_p) \ge \mathrm{OMAFE}(S_j)$, thus contradicting
$\mathrm{OMAFE}(T) < \mathrm{OMAFE}(S)$. Suppose $S_j$ is increasing; the proof is similar for the opposite case. Then
there exist $x < z \in S_j$ such that $F(x) - F(z) = 2 \times \mathrm{OMAFE}(S_j)$. From $\mathrm{OMAFE}(T_j) < \mathrm{OMAFE}(S_j)$
it follows that at least one of $x$ or $z$ lies in $S_j - T_j$, and hence $F(x) - F(y_j) \ge 2 \times \mathrm{OMAFE}(S_j)$ or
$F(y_{j+1}) - F(z) \ge 2 \times \mathrm{OMAFE}(S_j)$. Thus $\mathrm{OMAFE}(T_p) \ge \mathrm{OMAFE}(S_j)$ for either $p = j - 1$ or $p = j + 1$. □

For simplicity, we assume $F$ has no consecutive equal values, i.e. $F(x_i) \neq F(x_{i+1})$ for $i = 1, \ldots, n - 1$;
our algorithms assume all but one of consecutive equal values have been removed. We say $x_i$ is a
maximum if $i \neq 1$ implies $F(x_i) > F(x_{i-1})$ and if $i \neq n$ implies $F(x_i) > F(x_{i+1})$. Minima are defined
similarly.
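As an illustration of this definition, the following Python sketch lists the indices of the extrema of a sequence. The name `extrema_indices` is hypothetical, and keeping the first value of each run of equal consecutive values is an assumption; the text only requires that all but one be removed.

```python
def extrema_indices(values):
    """Indices of the maxima and minima, end points included."""
    # Collapse runs of equal consecutive values (assumption: keep the first).
    keep = [i for i in range(len(values))
            if i == 0 or values[i] != values[i - 1]]
    extrema = []
    for k, i in enumerate(keep):
        left = values[keep[k - 1]] if k > 0 else None
        right = values[keep[k + 1]] if k + 1 < len(keep) else None
        v = values[i]
        # A missing neighbour (first or last point) imposes no condition.
        is_max = (left is None or v > left) and (right is None or v > right)
        is_min = (left is None or v < left) and (right is None or v < right)
        if is_max or is_min:
            extrema.append(i)
    return extrema
```

For the array 0, 1, 2, 1.9, 3, 4 this returns the indices 0, 2, 3 and 5: the two end points plus the small bump at 2 and 1.9.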

Our mathematical approach is based on the concept of $\delta$-pair [13] (see Fig. 1):

Definition 3.2 The tuple $x, y$ ($x < y \in D$) is a $\delta$-pair (or a pair of scale $\delta$) for $F$ if $|F(y) - F(x)| \ge \delta$ and
for all $z \in D$, $x < z < y$ implies $|F(z) - F(x)| < \delta$ and $|F(y) - F(z)| < \delta$. A $\delta$-pair's direction is increasing
or decreasing according to whether $F(y) > F(x)$ or $F(y) < F(x)$.
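Definition 3.2 can be checked by brute force, as in the Python sketch below. The names `is_delta_pair` and `direction` are hypothetical. The sketch assumes the outer inequality is non-strict, $|F(y) - F(x)| \ge \delta$, which is the reading under which the pair 1, 4 in the sequence 1, 3, 2, 4 has scale 3; positions are 0-based indices into the list.

```python
def is_delta_pair(values, i, j, delta):
    """True if positions i < j form a delta-pair (brute force, O(j - i))."""
    if not 0 <= i < j < len(values):
        return False
    if abs(values[j] - values[i]) < delta:
        return False  # the end points are not far enough apart
    # Every interior point must stay within delta of both end points.
    return all(abs(values[k] - values[i]) < delta and
               abs(values[j] - values[k]) < delta
               for k in range(i + 1, j))


def direction(values, i, j):
    """+1 for an increasing pair, -1 for a decreasing pair."""
    return 1 if values[j] > values[i] else -1
```

With `values = [1, 3, 2, 4]`, positions (0, 3) form a pair of scale 3 but not of scale 2, because the interior value 3 is at distance 2 from the first end point.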

$\delta$-Pairs having opposite directions cannot overlap but they may share an end point. $\delta$-Pairs of the same
direction may overlap, but may not be nested. We use the term "*-pair" to indicate a $\delta$-pair having an
unspecified $\delta$. We say that a *-pair is significant at scale $\delta$ if it is of scale $\delta'$ for $\delta' \ge \delta$. From a topological
viewpoint, a *-pair is the pairing of critical points used to determine each extremum's persistence [14].

We define $\delta$-monotonicity as follows:

Definition 3.3 Let $X$ be an interval. $F$ is $\delta$-monotonic on $X$ if all $\delta$-pairs in $X$ have the same direction;
$F$ is strictly $\delta$-monotonic when there exists at least one such $\delta$-pair. In this case:

• $F$ is $\delta$-increasing on $X$ if $X$ contains an increasing $\delta$-pair.

• $F$ is $\delta$-decreasing on $X$ if $X$ contains a decreasing $\delta$-pair.

A $\delta$-monotonic interval $X$ satisfies $\mathrm{OMAFE}(X) < \delta/2$. We say that a *-pair $x, y$ is maximal if whenever
$z_1, z_2$ is a *-pair of a larger scale in the same direction containing $x, y$, then there exists a *-pair $w_1, w_2$
of an opposite direction contained in $z_1, z_2$ and containing $x, y$. For example, the sequence 1, 3, 2, 4 has 2
maximal *-pairs: 1, 4 and 3, 2. Maximal *-pairs of opposite direction may share a common point, whereas
maximal *-pairs of the same direction may not. Maximal *-pairs cannot overlap, meaning that it cannot
be the case that exactly one end point of a maximal *-pair lies strictly between the end points of another
maximal *-pair; either neither point lies strictly between or both do. In the case that both do, we say that
the one maximal *-pair properly contains the other. All *-pairs must be contained in a maximal *-pair.

Lemma 3.4 The smallest maximal *-pair containing a *-pair must be of the same direction. 

Proof Suppose a *-pair $P$ is immediately contained in a maximal *-pair $W$. Suppose $W$ is not in the same
direction as $P$. Then, within $W$, seek the largest *-pair in the same direction as $P$ and containing $P$; it must
be a maximal *-pair in $D$ since maximal *-pairs of different directions cannot overlap. □

The first and second point of a maximal *-pair are extrema and the reverse is true as well as shown by 
the next lemma. 

Lemma 3.5 Every extremum is either the first or second point of a maximal *-pair. 

Proof The case $x = x_1$ or $x = x_n$ follows by inspection. Otherwise, $x$ is the end point of a left and a right
*-pair. Each *-pair must immediately belong to a maximal *-pair of same direction: a *-pair $P$ is contained
in a maximal *-pair $M$ of same direction and there is no maximal *-pair $M'$ of opposite direction such
that $P \subset M' \subset M$. Let $M^l$ and $M^r$ be the maximal *-pairs immediately containing the left and right
*-pair of $x$. Suppose neither $M^l$ nor $M^r$ has $x$ as an end point. Suppose $M^l \subset M^r$; then the right *-pair
is not immediately contained in $M^r$, a contradiction. The result follows by symmetry. □

Our approach is to label each extremum in $F$ with a scale parameter $\delta$ saying that this extremum is
"significant" at scale $\delta$ and below. Our intuition is that by picking extrema at scale $\delta$, we should have a
segmentation having error less than $\delta/2$.

Definition 3.6 The scale labeling of an extremum x is the maximum of the scales of the maximal *-pairs 
for which it is an end point. 

For example, given the sequence 1,3,2,4 with 2 maximal *-pairs (1,4 and 3,2), we would give the 
following labels in order 3, 1, 1, 3. 

Definition 3.7 Given $\delta > 0$, a maximal alternating sequence of $\delta$-extrema $Y = y_1 \ldots y_{K+1}$ is a sequence
of extrema each having scale label at least $\delta$, having alternating types (maximum/minimum), and such
that there exists no sequence properly containing $Y$ having these same properties. From $Y$ we define a
maximal alternating $\delta$-segmentation of $D$ by segmenting at the points $x_1, y_2 \ldots y_K, x_n$.

Theorem 3.8 Given $\delta > 0$, let $P = S_1 \ldots S_K$ be a maximal alternating $\delta$-segmentation derived from a
maximal alternating sequence $y_1 \ldots y_{K+1}$ of $\delta$-extrema. Then any alternating segmentation $Q$ having
$\mathrm{OMAFE}(Q) < \mathrm{OMAFE}(P)$ has at least $K + 1$ segments.

Proof We show that conditions A and B of Proposition 3.1 are satisfied with $\epsilon = \mathrm{OMAFE}(P)$.

First we show that each segment $S_i$ is $\delta$-monotone; from this we conclude that $\mathrm{OMAFE}(P) < \delta/2$.
Intervals $[x_1, y_1]$ and $[y_{K+1}, x_n]$ contain no maximal *-pairs of scale $\delta$ or larger, and therefore contain no
*-pairs of scale $\delta$ or larger. Similarly, no $[y_i, y_{i+1}]$ contains an opposite-direction significant *-pair.

Condition A: Follows from $\delta$-monotonicity of each $S_i$ and maximal *-pairs not overlapping.

Condition B: We show that $|F(y_{i+1}) - F(y_i)| \ge \delta > 2 \times \mathrm{OMAFE}(P)$. If $i = 1$, then $y_1$ must begin
a maximal *-pair, and the maximal *-pair must end with $y_2$ since maximal *-pairs cannot overlap.
The case $i + 1 = K + 1$ is similar. Otherwise, since maximal *-pairs cannot overlap, each $y_i, y_{i+1}$ is either
a maximal *-pair of scale $\delta$ or larger or there exist indices $j$ and $k$, $j < i$ and $k > i + 1$ such that
$y_j, y_i$ is a maximal *-pair of scale at least $\delta$, and $y_{i+1}, y_k$ is a maximal *-pair of scale at least $\delta$. These two
maximal *-pairs have the same direction, and this direction is opposite to the direction of $[y_i, y_{i+1}]$. Now suppose
$|F(y_i) - F(y_{i+1})| < \delta$. Then $y_j, y_k$ is a *-pair properly containing $y_j, y_i$ and $y_{i+1}, y_k$. But neither $y_j, y_i$ nor
$y_{i+1}, y_k$ can be properly contained in a *-pair of opposite direction lying within $y_j, y_k$, thus contradicting
their maximality and proving the claim. □

Sequences of extrema labeled at least $\delta$ are generally not maximal alternating. For example the sequence
0, 10, 9, 10, 0 is scale labeled 10, 10, 1, 10, 10. However, a simple relabeling of certain extrema can make them
maximal alternating. Consider two same-sense extrema $z_1 < z_2$ such that lying between them there exists
no extremum having scale at least as large as the minimum of the two extrema's scales. We must have
$F(z_1) = F(z_2)$, since otherwise the point upon which $F$ has the lesser value could not be the endpoint of
a maximal *-pair. This is the only situation which causes choice when constructing a maximal alternating
sequence of $\delta$-extrema. To eliminate this choice, replace the scale label on $z_1$ with the largest scale of the
opposite-sense extrema lying between them. In the next section, Algorithm 1 incorporates this relabeling,
making Algorithm 2 simple and efficient.

3.1. Computing a Scale Labeling Efficiently 

Algorithm 1 (next page) produces a scale labeling in linear time. Extrema from the original data are 
visited in order, and they alternate (maxima/minima) since we only pick one of the values when there are 
repeated values (such as 1,1,1). 

The algorithm has a main loop (lines 5 to 15) where it labels extrema as it identifies extremal *-pairs,
and stacks the extrema it cannot immediately label. At all times, the stack (line 3) contains minima and
maxima in strictly increasing and decreasing order respectively. Also at all times, the last two extrema
at the bottom of the stack are the absolute maximum and absolute minimum (found so far). Observe that
we can only label an extremum as we find new extremal *-pairs (lines 7, 11, and 17).

• If the stack is empty or contains only one extremum, we simply add the new extremum (line 14).

• If there are only 2 extrema $z_1, z_2$ in the stack and we found either a new absolute maximum or new
absolute minimum ($z_3$), we can pop and label the oldest one ($z_1$) (lines 10, 11, and 12) because the old
pair ($z_1, z_2$) forms a maximal *-pair and thus must be bounded by extrema having at least the same
scale while the oldest value ($z_1$) does not belong to a larger maximal *-pair. Otherwise, if there are only
2 extrema $z_1, z_2$ in the stack and the new extremum $z_3$ satisfies $z_3 \in (\min(z_1, z_2), \max(z_1, z_2))$, then we
add it to the stack since no labeling is possible yet.

• While the stack contains more than 2 extrema (lines 6, 7 and 8), we consider the last three points
on the stack ($s_3, s_2, s_1$) where $s_1$ is the last point added. Let $z$ be the value of the new extremum. If
$z \in (\min(s_1, s_2), \max(s_1, s_2))$, then it is simply added to the stack since we cannot yet label any of these
points; we exit the while loop. Otherwise, we have a new maximum (resp. minimum) exceeding (resp.
lower than) or matching the previous one on the stack, and hence $s_1, s_2$ is a maximal *-pair. If $z \neq s_2$, then $s_3, z$
is a maximal *-pair and thus $s_2$ cannot be the end of a maximal *-pair and $s_1$ cannot be the beginning
of one, hence both $s_2$ and $s_1$ are labeled. If $z = s_2$ then we have successive maxima or minima and the
same labeling as for $z \neq s_2$ applies.

During the "unstacking" (lines 16 and following), we visit a sequence of minima and maxima forming
increasingly larger maximal *-pairs.

The algorithm runs in time $O(n)$ (independent of K). Indeed, for any index of an extremum, the condition
of the inner while loop (line 6) will evaluate once to false; moreover, this condition cannot evaluate to true more than
$O(n)$ times in total, since each such evaluation pops two extrema from the stack.

Once the labeling is complete, we find the $K + 2$ extrema having largest scale in time $O(nK)$ using $O(K)$
memory, then we remove all extrema having the same scale as the smallest scale in these $K + 2$ extrema
(removing at least one), and we replace the first and the last extrema by 0 and $n - 1$ respectively (see Algorithm 2).
The result is an optimal segmentation having at most K segments.

Alternatively, if we plan to resegment the time series several times with different values of K, we can
sort all extrema by their label in time $O(n \log n)$, and compute in time $O(n)$ an auxiliary structure on the
sorted set so that when selecting the $i$th item in the sorted list ($d_i$), we obtain the index $j$ of the earliest
occurrence of this scale in the list ($\mathrm{scale}(d_j) = \mathrm{scale}(d_i)$ and $\mathrm{scale}(d_j) < \mathrm{scale}(d_{j-1})$ if $j > 0$) in constant
time. Hence, we can segment any time series optimally in constant time given this precomputation in time
Algorithm 1 Algorithm to compute the scale labeling in $O(n)$ time.

1: INPUT: an array $d$ containing the y values indexed from 0 to $n - 1$, repeated consecutive values have been removed
2: OUTPUT: a scale labeling for all extrema
3: $S \leftarrow$ empty stack, First($S$) is the value on top, Second($S$) is the second value
4: define $\delta(d, S) = |d_{\mathrm{First}(S)} - d_{\mathrm{Second}(S)}|$
5: for $e$ index of an extremum in $d$, $e$'s are visited in increasing order do
6:   while length($S$) > 2 and ($e$ is a minimum such that $d_e \le$ Second($S$) or $e$ is a maximum such that $d_e \ge$ Second($S$)) do
7:     label First($S$) and Second($S$) with $\delta(d, S)$
8:     pop stack $S$ twice
9:   end while
10:   if length($S$) is 2 and ($e$ is a minimum such that $d_e \le$ Second($S$) or $e$ is a maximum such that $d_e \ge$ Second($S$)) then
11:     label Second($S$) with $\delta(d, S)$
12:     remove Second($S$) from stack $S$
13:   end if
14:   stack $e$ to $S$
15: end for
16: while length of $S$ > 2 do
17:   label First($S$) with $\delta(d, S)$
18:   pop stack $S$
19: end while
20: label First($S$) and Second($S$) with $\delta(d, S)$
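A possible Python 3 transcription of Algorithm 1 is sketched below; it is not the authors' code. The name `scale_labeling` is hypothetical, the stack holds indices into `d`, and the comparisons are taken as non-strict (a new extremum "exceeding or matching" the one on the stack), which is an assumption about the listing. Returning an empty labeling when there are fewer than two extrema is also an assumption, since the listing does not cover that case.

```python
def scale_labeling(d):
    """Map each extremum index of d to its scale label (sketch of Algorithm 1)."""
    n = len(d)

    def kind(i):
        """+1 for a maximum, -1 for a minimum, 0 otherwise."""
        if (i == 0 or d[i] > d[i - 1]) and (i == n - 1 or d[i] > d[i + 1]):
            return 1
        if (i == 0 or d[i] < d[i - 1]) and (i == n - 1 or d[i] < d[i + 1]):
            return -1
        return 0

    labels, stack = {}, []

    def gap():
        # delta(d, S): distance between the two topmost stacked values
        return abs(d[stack[-1]] - d[stack[-2]])

    def dominates(e, is_max):
        # Does the new extremum reach or pass the second value on the stack?
        second = d[stack[-2]]
        return d[e] >= second if is_max else d[e] <= second

    for e in range(n):
        k = kind(e)
        if k == 0:
            continue
        while len(stack) > 2 and dominates(e, k > 0):
            scale = gap()
            labels[stack.pop()] = scale
            labels[stack.pop()] = scale
        if len(stack) == 2 and dominates(e, k > 0):
            labels[stack[-2]] = gap()
            del stack[-2]
        stack.append(e)
    while len(stack) > 2:          # unstacking
        labels[stack[-1]] = gap()
        stack.pop()
    if len(stack) == 2:
        labels[stack[0]] = labels[stack[1]] = gap()
    return labels
```

On the sequence 1, 3, 2, 4 the sketch yields the labels 3, 1, 1, 3 given earlier. On 0, 10, 9, 10, 0 it yields 10, 1, 1, 10, 10, that is, the first of the two equal maxima receives the label of the minimum between them.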



Algorithm 2 Given the scale labeling, this algorithm will return a segmentation using at most K segments.
It is assumed that there are at least $K + 1$ extrema to begin with.

INPUT: an array $d$ containing the y values indexed from 1 to $n$
INPUT: $K$ a bound on the number of segments desired
OUTPUT: unsorted segmentation points (a $\delta$-segmentation)
$L \leftarrow$ empty array (capacity $K + 3$)
for $e$ is index of an extremum in $d$ having scale $\delta$, $e$ are visited in increasing order do
  insert $(e, \delta)$ in $L$ so that $L$ is sorted by scale in decreasing order (sort on $\delta$) using binary search
  if length of $L$ is $K + 3$ then
    pop last($L$)
  end if
end for
remove all elements of $L$ having the scale of last($L$)
RETURN: the indexes in $L$ replacing first one by 1 and last one by $n$
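The selection step of Algorithm 2 can be sketched in Python as below. The function name `segment_points` is hypothetical, `labels` is assumed to be a mapping from extremum index to scale, indices are 0-based, and `heapq.nlargest` stands in for the sorted insertion with binary search. Falling back to a single segment when fewer than two extrema survive is an assumption, not something the listing specifies.

```python
import heapq


def segment_points(labels, n, K):
    """Sorted segmentation indices using at most K segments (sketch)."""
    # Keep the K + 2 extrema of largest scale.
    top = heapq.nlargest(K + 2, labels.items(), key=lambda item: item[1])
    if not top:
        return [0, n - 1]
    cutoff = top[-1][1]
    # Drop every extremum whose scale equals the smallest retained scale.
    kept = sorted(index for index, scale in top if scale > cutoff)
    if len(kept) < 2:
        return [0, n - 1]  # assumption: degenerate case becomes one segment
    # The first and last segmentation points are the ends of the data.
    kept[0], kept[-1] = 0, n - 1
    return kept
```

For the sequence 0, 5, 3, 4, 1 with labels 5, 5, 1, 1, 4, a budget of K = 2 keeps the indices 0, 1 and 4, giving one increasing and one decreasing segment.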



$O(n \log n)$.
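One way to realize this precomputation is sketched below in Python. The names `build_scale_index` and `extrema_for_budget` are hypothetical, and `labels` is assumed to map extremum indices to scales with at least one entry. The query returns only a count, so that the retained extrema are a prefix of the sorted list and no copying is needed at query time.

```python
def build_scale_index(labels):
    """Sort extrema by decreasing scale and record, for each position,
    the earliest position holding the same scale."""
    order = sorted(labels.items(), key=lambda item: -item[1])
    first = []
    for pos, (_, scale) in enumerate(order):
        same_as_previous = pos > 0 and order[pos - 1][1] == scale
        first.append(first[-1] if same_as_previous else pos)
    return order, first


def extrema_for_budget(order, first, K):
    """Number of leading entries of `order` kept for a budget of K segments.

    The (K + 2)-th largest extremum and all extrema sharing its scale
    are dropped, so the kept extrema are order[:result].
    """
    pos = min(K + 1, len(order) - 1)
    return first[pos]
```

With the labels 5, 5, 1, 1, 4 for indices 0 to 4, the sorted order is 0, 1, 4, 2, 3; a budget of K = 2 keeps the first three entries, which matches the selection made by Algorithm 2.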

Lemma 3.9 Given a precomputation in time $O(n \log n)$ using $O(n)$ storage, for any desired upper bound
on the number of segments K, we can compute the segmentation points of an optimal OMAFE, and the
corresponding OMAFE value, in constant time.

Hence, we can compute an OMAFE versus K plot in $O(n \log n)$ time.
4. Experimental Results and Comparison to Piecewise Linear Segmentation Heuristics

We compare our optimal $O(nK)$ algorithm with our implementations of two piecewise linear segmentation
heuristics [3]: top-down, which runs in $O(nK)$ time (see Algorithm 3), and bottom-up, which runs in
$O(n(n - K))$ time (see Algorithm 4). The top-down heuristic successively segments the data starting with
only one segment, each time picking the segment with the worst linear regression error and finding the best
segmentation point; the linear regression is not continuous from one segment to the other. The regression
error can be computed in constant time if one has precomputed the range moments [15, 16]. The
bottom-up heuristic starts with intervals containing only one data point and successively merges them, each time
choosing the least expensive merge. By maintaining the segments in a doubly-linked list coupled with a
heap or tree, it is possible to obtain a bottom-up heuristic with $O((n - K) \log n)$ complexity, but it then
uses much more memory and it is more difficult to implement.
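The constant-time regression error can be illustrated with prefix sums, as in the Python sketch below. The class name `RangeMoments` is hypothetical, and the error is assumed here to be the sum of squared residuals of the least-squares line; the text does not say which regression error is used.

```python
from itertools import accumulate


class RangeMoments:
    """Prefix sums of x, y, x*x, x*y and y*y for O(1) range queries."""

    def __init__(self, xs, ys):
        def prefix(seq):
            return [0.0] + list(accumulate(seq))

        self.sx = prefix(xs)
        self.sy = prefix(ys)
        self.sxx = prefix(x * x for x in xs)
        self.sxy = prefix(x * y for x, y in zip(xs, ys))
        self.syy = prefix(y * y for y in ys)

    def fit_error(self, p, q):
        """Sum of squared residuals of the best line over indices p..q inclusive."""
        m = q - p + 1
        sx = self.sx[q + 1] - self.sx[p]
        sy = self.sy[q + 1] - self.sy[p]
        cxx = self.sxx[q + 1] - self.sxx[p] - sx * sx / m
        cxy = self.sxy[q + 1] - self.sxy[p] - sx * sy / m
        cyy = self.syy[q + 1] - self.syy[p] - sy * sy / m
        if cxx == 0:
            return max(cyy, 0.0)  # single point (or identical x values)
        # Clamp at zero: cancellation can give tiny negative values.
        return max(cyy - cxy * cxy / cxx, 0.0)
```

Building the prefix sums takes linear time once; each call to `fit_error` then costs a constant number of arithmetic operations, at the price of some floating-point cancellation on long series.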

Once the piecewise linear segmentation is completed, we run through the segments and aggregate
consecutive segments having the same sign where the sign of a segment $[y_k, y_{k+1}]$ is defined by $F(y_{k+1}) - F(y_k)$,



Algorithm 3 Piecewise Linear Top-Down Segmentation Heuristic.

INPUT: Time Series $(x_i, y_i)$ of length $n$
INPUT: Desired number of segments $K$
INPUT: Function $E(p, q)$ computing linear fit error in range $[x_p, x_q]$
$S \leftarrow (0, n, E(0, n))$
while $|S| < K$ do
  find tuple $(i, j, \epsilon)$ in $S$ with maximum last entry
  find minimum of $E(i, l) + E(l + 1, j)$ for $l = i, \ldots, j - 1$
  remove tuple $(i, j, \epsilon)$ from $S$
  insert tuples $(i, l, E(i, l))$ and $(l + 1, j, E(l + 1, j))$ in $S$
end while
$S$ contains the segmentation



Algorithm 4 Piecewise Linear Bottom-Up Segmentation Heuristic.

INPUT: Time Series $(x_i, y_i)$ of length $n$
INPUT: Desired number of segments $K$
INPUT: Function $E(p, q)$ computing linear fit error in range $[x_p, x_q]$
$S \leftarrow [0, 0], [1, 1], [2, 2], \ldots, [n, n]$
while $|S| > K$ do
  find consecutive intervals in $S$, $[p_1, p_2]$ and $[p_2 + 1, p_3]$, having minimal value $E(p_1, p_3) - E(p_1, p_2) - E(p_2 + 1, p_3)$
  merge the two consecutive intervals
end while
$S$ contains the segmentation
Figure 2. Time to segment a time series of length n in K = 20 segments.

setting 0 to be a positive sign (increasing monotonicity).
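This aggregation step can be sketched as follows in Python. The name `merge_same_sign` is hypothetical, and the segmentation is assumed to be given as a list of cut indices with consecutive segments sharing an end point.

```python
def merge_same_sign(values, cuts):
    """Merge consecutive segments having the same sign; zero counts as positive."""
    merged = [cuts[0]]
    previous_sign = None
    for a, b in zip(cuts, cuts[1:]):
        sign = 1 if values[b] - values[a] >= 0 else -1
        if sign == previous_sign:
            merged[-1] = b      # extend the current segment
        else:
            merged.append(b)    # start a new segment
        previous_sign = sign
    return merged
```

For the sequence 0, 2, 1, 0, 2 cut at indices 0, 2, 4, both segments are increasing, so the result is the single segment from index 0 to index 4.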

We implemented all algorithms in Python (version 2.5) and ran the experiments on a 2.16 GHz Intel Core 
2 Duo processor with sufficient RAM (1 GB). Fig. 2 presents the relative speed of the various segmentation 
algorithms on time series of various lengths for a fixed number of segments (using randomly generated 
data). The timings reported include all pre-processing. 
4.1. Electrocardiograms (ECG)

ECGs have a well known monotonicity structure with 5 commonly identifiable extrema per pulse (reference 
points P, Q, R, S, and T) (see Fig. 3) though not all points can be easily identified on all pulses and the 
exact morphology can vary. We used freely available samples from the MIT-BIH Arrhythmia Database [17]. 
We only present our results over one sample (labeled "100.dat") since we found that results did not vary
much between data samples. These ECG recordings used a sampling rate of 360 Hz per channel with 
11-bit resolution (see Fig. 4(a)). We keep the first 4000 samples (11 seconds) and about 14 pulses, and 
we do no preprocessing such as baseline correction. We can estimate that a typical pulse has about 5 
"easily" identifiable monotonic segments. Hence, out of 14 pulses, we can estimate that there are about 
70 significant monotonic segments, some of which match the domain-specific markers (reference points P, 
Q, R, S, and T). A qualitative description of such data is useful for pattern matching applications. 

The running time as a function of K is presented in Fig. 4(b). The scale-based segmentation
implementation is faster than our implementations of the piecewise linear heuristics. On such a long time series
Figure 3. Schema of an ECG pulse with commonly identified reference points (PQRST).

Figure 4. Results of experiments over ECG data. Panel (c): OMAFE vs. number of segments K.



(4000 samples), our implementation of the bottom-up heuristic is much slower than the alternatives. 

We want to determine how well the piecewise linear segmentation heuristics do comparatively. OMAFE
is an absolute and not relative error measure, but because the range of the ECGs under consideration is
roughly between 950 and 1150, we expect the OMAFE to never exceed 100 by much. The OMAFE with
respect to the maximal number of segments (K) is given in Fig. 4(c): it is a "monotonicity spectrum."
By counting on about 5 monotonic segments per pulse with a total of 14 pulses, there should be about
70 monotonic segments in the 4000 samples under consideration. We see that the decrease in OMAFE
with the addition of new segments starts to level off between 50 and 70 segments as predicted. The addition
of new segments past 70 (K > 70) has little impact. The scale-based algorithm is optimal, and it is also at
least 3 times more accurate than the top-down algorithm for larger K; this is consistent over other
data sets. In fact, the OMAFE becomes practically zero for K > 80 whereas the OMAFE of the top-down
linear regression algorithm remains at around 20, which is still significant. The bottom-up heuristic is more
accurate than the top-down heuristic, but it still has about twice the OMAFE for large K. The OMAFE of
the scale-based algorithm is a nonincreasing function of K, a consequence of optimality.
Figure 5. Results of experiments over daily temperature data.



4.2. Temperature Recordings 

We consider the daily temperature recordings of the first of 35 weather stations in the MD*Base Daily
temperature data set [18]^1. Since we only have one year of recordings, only 365 data points are used
(see Fig. 5(a)). We also give the running times (see Fig. 5(b)) and the accuracy (see Fig. 5(c)). Our
implementation of the bottom-up heuristic is now much faster due to the small size of the time series, but
the OMAFE, while superior to the top-down heuristic, exhibits a spurious spike near K = 40, showing the
danger of relying on a piecewise linear heuristic to study the monotonicity of a data set. Considering the
OMAFE of our scale-based algorithm, we notice that the accuracy increases slowly after K = 10.

4.3. Synthetic Random Walk Data 

Random walks are often used as models for common time series such as stock prices. We generated a
random walk $(i, y_i)_{i=1,\ldots,4000}$ using the formula $y_{i+1} = y_i + \epsilon$ where $\epsilon \sim N(0, 1)$ (see Fig. 6(a)). The running
times are nearly identical to the ECG case, as is expected since the time series have the same length.
However, the OMAFE differs (see Fig. 6(c)): using our optimal algorithm, the curve is smooth with no
sharp drop. Meanwhile, the bottom-up heuristic exhibits another spurious spike in the OMAFE (around
K = 20) while it provides the optimal segmentation at K = 5.
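A random walk of this form can be generated with the Python standard library, as in the sketch below. The function name `random_walk`, the starting value of 0 and the optional seed are assumptions made for illustration; the text does not state the initial value or the random generator used.

```python
import random


def random_walk(n, seed=None):
    """Return n values with y[i + 1] = y[i] + e, where e ~ N(0, 1)."""
    rng = random.Random(seed)   # local generator so runs can be reproduced
    walk = [0.0]                # assumption: the walk starts at 0
    for _ in range(n - 1):
        walk.append(walk[-1] + rng.gauss(0.0, 1.0))
    return walk
```

Calling `random_walk(4000, seed=1)` gives a series of the same length as the one used in this section, though not the same values.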



^1 The data is attributed to Ramsay and Silverman [19].
Figure 6. Results of experiments over random walk.

5. Conclusion and Future Work 



We presented optimal and fast algorithms to compute the best piecewise monotonic segmentation in time
$O(n)$ and the complete OMAFE-versus-K spectrum in time $O(n \log n)$. Our experimental results suggest
that one should be careful when deriving monotonicity information from piecewise linear segmentation
heuristics. Future work will focus on choosing the optimal number of segments for given applications. We
also plan to investigate the applications of the monotonicity spectrum as a robust analysis. Further work
to integrate flat segments is needed [5, 16].